Inducing Controlled Error over Variable Length Ranked Lists

Authors

  • Laurence Anthony F. Park
  • Glenn Stone
Abstract

When examining the robustness of systems that take ranked lists as input, we can induce noise, measured in terms of Kendall’s tau rank correlation, by applying a set number of random adjacent transpositions. Fixing the number of random transpositions ensures that any ranked list induced with this noise has a specific expected Kendall’s tau. However, if we have ranked lists of varying length, it is not clear how many random transpositions we must apply to each list to obtain a consistent expected Kendall’s tau across the collection. In this article we investigate how to compute the number of random adjacent transpositions required to obtain a given expected Kendall’s tau for a given list length, and find that it is infeasible to compute for lists of length more than 9. We also investigate an alternative and more efficient method of inducing noise in ranked lists, called Gaussian Perturbation. We show that, using this method, we can compute the parameters required to induce a consistent level of noise for lists of length 10 in just over six minutes. We also provide an approximate solution that returns results in less than 10^-5 seconds.
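The two noise models named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the expected Kendall’s tau is estimated by simple Monte Carlo sampling rather than computed exactly, and the Gaussian perturbation shown here (add zero-mean Gaussian noise to each rank position, then re-sort) is an assumed reading of the method that may differ from the paper's definition.

```python
import random
from itertools import combinations


def kendall_tau(ranking):
    """Kendall's tau between `ranking` and the sorted order (no ties assumed)."""
    n = len(ranking)
    pairs = n * (n - 1) // 2
    # A pair is discordant when the earlier position holds the larger item.
    discordant = sum(
        1 for i, j in combinations(range(n), 2) if ranking[i] > ranking[j]
    )
    return 1.0 - 2.0 * discordant / pairs


def random_adjacent_transpositions(n, k, rng):
    """Start from the identity ranking of length n and apply k random adjacent swaps."""
    ranking = list(range(n))
    for _ in range(k):
        i = rng.randrange(n - 1)  # left index of the adjacent pair to swap
        ranking[i], ranking[i + 1] = ranking[i + 1], ranking[i]
    return ranking


def gaussian_perturbation(n, sigma, rng):
    """Assumed variant: add N(0, sigma^2) noise to each position, then re-rank."""
    noisy = [i + rng.gauss(0.0, sigma) for i in range(n)]
    return sorted(range(n), key=lambda i: noisy[i])


def estimate_expected_tau(sampler, trials=2000):
    """Monte Carlo estimate of the expected Kendall's tau of sampled rankings."""
    return sum(kendall_tau(sampler()) for _ in range(trials)) / trials


if __name__ == "__main__":
    rng = random.Random(0)
    # The same number of swaps (k = 10) is applied to lists of different lengths.
    for n in (5, 10, 20):
        tau = estimate_expected_tau(
            lambda: random_adjacent_transpositions(n, 10, rng)
        )
        print(f"n={n:2d}  k=10  estimated E[tau] = {tau:.3f}")
```

Running the script prints the estimated expected tau for each list length under the same number of swaps, which makes the problem described in the abstract concrete: a fixed transposition count does not by itself guarantee the same expected correlation across lists of different lengths.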

Similar resources

Efficient motif search in ranked lists and applications to variable gap motifs

Sequence elements at all levels (DNA, RNA and protein) play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their c...

CLEF 2005: Multilingual Retrieval by Combining Multiple Multilingual Ranked Lists

We participated in two tasks: Multi-8 two-years-on retrieval and Multi-8 results merging. For our Multi-8 two-years-on retrieval work, simple multilingual ranked lists are first built by merging ranked lists of different languages that are generated by single types of retrieval algorithms. Then, algorithms are proposed to combine these simple multilingual ranked lists into a single ranked list....

Stability of Ranked Gene Lists in Large Microarray Analysis Studies

This paper presents an empirical study that aims to explain the relationship between the number of samples and the stability of different gene selection techniques for microarray datasets. Unlike other similar studies, where the number of genes in a ranked gene list is variable, this study uses an alternative approach where stability is observed at different numbers of samples that are used for gene sele...

Evaluation of Personalized Concept-Based Search and Ranked Lists over Linked Open Data

Linked Open Data (LOD) provides rich structured data. As the size of LOD grows, accessing the right information becomes more challenging. In particular, the commonly used ranked-list presentation of current LOD search engines is not effective for search tasks in unfamiliar domains. Recently, the combination of clustering and personalized search has gained more attention for this purpose. In this paper,...

VLSI implementation of a reversible variable length encoder/decoder

Variable Length Codes (VLCs) are known for their efficient compression, but are susceptible to noisy environments due to synchronization losses that can occur from bit error propagation. Recent interest in Reversible Variable Length Codes (RVLCs) has come about due to the growing need for wireless exchange of compressed image and video signals over noisy channels and the ability for RVLCs to p...

Publication date: 2014